The Internet infrastructure is severely stressed. Rapidly growing overheads associated with the
primary function of the Internet — routing information packets between any two computers in the 
world — cause concerns among Internet experts that the existing Internet routing architecture may 
not sustain even another decade. Here we present a method to map the Internet to a hyperbolic 
space. Guided by the constructed map, which we release with this paper, Internet routing exhibits
scaling properties close to theoretically best possible, thus resolving serious scaling limitations that 
the Internet faces today. Besides this immediate practical viability, our network mapping method 
can provide a different perspective on the community structure in complex networks. 



I. INTRODUCTION 

In the Information Age, the Internet is becoming a de 
facto public good, akin to roads, airports, or any other 
critical infrastructure [1]. More than a billion people are 
estimated to use the Internet every day to communicate, 
search for information, share data, or do business [2]. 
On-line social networks are becoming an integral part 
of human social activities, increasingly affecting human 
psychology [3]. Underlying all these processes is the In- 
ternet infrastructure, composed, at the large scale, of 
connections between Autonomous Systems (ASs). An 
AS is, roughly, a part of the Internet owned and admin- 
istered by the same organisation [4]. ASs range in size 
from small companies, or even private users, to huge in- 
ternational corporations. There is no central Internet 
authority dictating to any AS what other ASs to con- 
nect to. Connections between ASs are results of local 
independent decisions based on business agreements be- 
tween AS pairs. This lack of centralised engineering con- 
trol makes the Internet a truly self-organised system, and 
poses many scientific challenges. The one we address here 
is the sustainability of Internet growth. 

The Internet has been growing fast according to all 
measures [5, 6]. For example, the number of ASs in- 
creases by approximately 2,400 every year [5]. Despite 
its growth, the Internet must sustainably perform its pri- 
mary task — routing information packets between any two 
computers in the world. But can this function really be
sustained? To route information to a given destination in 
the Internet today, all ASs must collectively discover the 
best path to each possible destination, based on the cur- 
rent state of the global Internet topology. As the number 
of destinations grows quickly, the amount of information 
each AS has to maintain becomes a serious scalability 
concern, endangering the performance and stability of 
the Internet [7]. Worse yet, the Internet is not static. 
Its topology changes constantly due to failures of exist- 
ing links and nodes, or appearances of new ones. Each 
time such a change occurs anywhere in the Internet, the
information about this event must be diffused to all ASs,
which have to quickly process it to recompute new best
routes. The constantly increasing size and dynamics of
the Internet thus lead to immense and quickly growing
routing overheads, causing concerns among Internet ex- 
perts that the existing Internet routing architecture may 
not sustain even another decade [7-10]; parts of the In- 
ternet have started sinking into black holes already [11]. 

The scaling limitations of existing Internet routing
stem from the requirement to have a current state of 
the Internet topology distributed globally. Such global 
knowledge is unavoidable since routing has no source of 
information other than the network topology. Routing 
in these conditions is equivalent to routing using a hy- 
pothetical road atlas, which has no geographic informa- 
tion, but just lists road network links, which are pairs of 
connected road intersections, abstractly identified. This 
analogy with road routing suggests that there are bet- 
ter ways to find paths in networks. Suppose we want to 
travel from one geographic place to another. Given the 
geographic coordinates of our starting point and destina- 
tion, we can readily tell what direction brings us closer 
to our destination. We see that a coordinate system in
a geometric space, coupled with a representation of the
world in this space, drastically simplifies our routing task.
For simple and efficient network routing we thus need 
a map. Constructing such a map for the Internet boils 
down to assigning to each AS its coordinates in some 
geometric space, and then using this space to forward 
information packets in the right directions toward their 
destinations. Greedy forwarding implements this routing 
in the right direction: upon reading the destination ad- 
dress in the packet, the current packet holder forwards 
the packet to its neighbour closest to the destination in 
the space. This greedy strategy to reach a destination 
is efficient only if the network map is congruent with 
the network topology. In the analogy with road rout- 
ing, for example, this congruency condition means that 
there should exist a road path that stays approximately
close to the geographic geodesic between the trip's starting
and ending points. If the congruency condition holds,
then the advantage of greedy forwarding is twofold. First,
the only information that ASs must maintain is the co- 
ordinates of their neighbours. That is, ASs do not have 
to keep any per-destination information. Second, once 
ASs are given their coordinates, these coordinates do not 
change upon topological changes of the Internet. There- 
fore, ASs do not have to exchange any information about 
ever-changing Internet topology. Taken together, these 
two improvements essentially eliminate the two scaling 
limitations mentioned above. 

In our recent work [12-15] we have shown that greedy 
forwarding is indeed efficient in Internet-like synthetic 
networks embedded in geometric spaces, and that this 
efficiency is maximised if the space is hyperbolic. However,
putting these ideas into practice requires a crucial piece
of information: a map of the real Internet in a hyperbolic 
space. Here we present a method to find such a map. 

Our method uses statistical inference techniques to find 
coordinates for each AS in the hyperbolic space under- 
lying the Internet. Guided by the inferred coordinates, 
greedy forwarding in the Internet achieves efficiency and 
robustness, similar to those in synthetic networks. We 
also find that the method maps geo-politically close ASs 
close to each other in the hyperbolic space. This finding 
suggests that our mapping method can be used for soft 
community detection in real networks, where by soft com- 
munities we mean groups of geometrically close nodes. 



II. THE MODEL 

To build a geographic map, one first has to model the 
Earth's surface, e.g., by assuming that it is a sphere. Similarly,
we also need a geometric model of the Internet
space to build our map. The simplest candidate space 
is also a sphere, or even a circle, on which nodes are 
uniformly distributed, and connected by an edge with 
probability p(d) decreasing as a function of distance d be- 
tween nodes, conceptually similar to random geometric 
graphs [16]. However, this model fails to capture basic 
properties of the Internet topology, including its scale- 
free node degree distribution. In [17], we showed that 
to generate realistic network topologies in this geometric 
approach, we first have to assign to nodes their expected
degrees $\kappa$ drawn from a power-law distribution, and then
connect pairs of nodes with expected degrees $\kappa$ and $\kappa'$
with probability $p(\chi)$, where $\chi$ is distance $d$ rescaled
by the product of the expected degrees, $\chi \sim d/(\kappa\kappa')$.
We thus have a hybrid model that mixes geometry and
topology: geometric characteristics, distances $d$ used in
random geometric graphs, come in tandem with topological
characteristics, expected degrees $\kappa$ used in classical
configuration models of random power-law graphs [18].
If we associate the expected degree $\kappa$ of a node with its
mass, then the connection probability $p(d/(\kappa\kappa'))$, which
is a measure of the interaction strength between two
nodes, resembles Newton's law of gravitation. Therefore
we call this model Newtonian. However, according to
Einstein, we can treat gravity in purely geometric terms
if we accept that the space is no longer flat, i.e., if it
is non-Euclidean. Following this philosophy we showed
in [19, 20] that the Newtonian model is isomorphic to a
purely geometric network model with node degrees transformed
into a geometric coordinate making the space hyperbolic,
i.e., negatively curved. We call this model Einsteinian.

The main property of hyperbolic geometry is the ex- 
ponential expansion of space illustrated in Fig. 1. For 
example, the area A(r) of a two-dimensional hyperbolic 
disc of radius $r$ grows with $r$ as $A(r) \sim e^{r}$. Consequently,
if we distribute nodes uniformly or quasi-uniformly over 
a hyperbolic disc, then from the Euclidean perspective 
their density will grow exponentially with the distance 
from the disc centre. We illustrate this effect in Fig. 2, 
where we visualise a small-size sample network gener- 
ated by our Einsteinian model. In the model, nodes are 
indeed distributed (quasi-)uniformly within a hyperbolic 
disc of radius R, which is a function of the network size. 
We see that the angular node density appears uniform, 
but the radial one does not — the number of nodes grows 
exponentially as we move away from the origin. The figure
also shows a triangle connecting origin $O$ and two
nodes $a$ and $b$ by hyperbolic geodesics, i.e., hyperbolically
straight lines. The two geodesics emanating from
the origin $O$, $Oa$ and $Ob$, are radial straight lines, and
their hyperbolic lengths $x$ are equal to the radial coordinates
of $a$ and $b$: $x_{Oa} = r_a$ and $x_{Ob} = r_b$. However, the
hyperbolic geodesic between nodes $a$ and $b$ does not appear
as a Euclidean straight line, and its length is given
by the hyperbolic law of cosines

$$\cosh x_{ab} = \cosh r_a \cosh r_b - \sinh r_a \sinh r_b \cos \Delta\theta_{ab}, \qquad (1)$$

where $\Delta\theta_{ab}$ is the angle between $Oa$ and $Ob$. (The same
formula with $r_O = 0$ can be used to compute $x_{Oa} = r_a$
and $x_{Ob} = r_b$.) Upon distributing nodes over the disc as
described, we form scale-free networks in the model by
connecting each pair of nodes $i$ and $j$ located at hyperbolic
distance $x_{ij}$ with the connection probability

$$p(x_{ij}) = \left(1 + e^{\frac{x_{ij} - R}{2T}}\right)^{-1}, \qquad (2)$$

almost identical to the Fermi-Dirac distribution in statistical
mechanics. It depends only on hyperbolic distances
$x_{ij}$ (link energies), hyperbolic disc radius $R$ (chemical potential),
and parameter $T$ (temperature) controlling
network clustering. This connection probability results
in average node degrees exponentially decreasing with 
the distance from the origin, which we also observe in 
Fig. 2. The combination of an exponentially increasing 
node density and exponentially decreasing average de- 
gree yields a power-law node degree distribution in the 
network. See Appendix A for further details. 
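
As an illustration, the hyperbolic law of cosines and the Fermi-Dirac-like connection probability of Eqs. (1) and (2) can be evaluated numerically as in the following minimal Python sketch. The function names and the sample coordinates are illustrative assumptions, not part of the released map; the values of R and T are those quoted in the caption of Fig. 4.

```python
import math

def hyperbolic_distance(r_a, theta_a, r_b, theta_b):
    """Hyperbolic distance between (r_a, theta_a) and (r_b, theta_b), Eq. (1)."""
    cosh_x = (math.cosh(r_a) * math.cosh(r_b)
              - math.sinh(r_a) * math.sinh(r_b) * math.cos(theta_a - theta_b))
    # Rounding errors can push cosh_x marginally below 1 for nearly
    # coincident points, where acosh would raise a domain error.
    return math.acosh(max(cosh_x, 1.0))

def connection_probability(x, R, T):
    """Connection probability of Eq. (2); this sketch assumes T > 0."""
    z = (x - R) / (2.0 * T)
    if z > 700.0:  # exp(z) would overflow a double; p is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

# Hypothetical pair of nodes; R and T taken from the caption of Fig. 4.
R, T = 27.0, 0.69
x = hyperbolic_distance(20.0, 0.1, 22.0, 0.4)
print(x, connection_probability(x, R, T))
```

Two limiting cases serve as sanity checks: a node at the origin is at distance r from a node with radial coordinate r, and a pair at distance exactly R is connected with probability 1/2.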

FIG. 1: The exponentially growing number of people lying on
the hyperbolic floor illustrates the exponential expansion of
hyperbolic space. All people are of the same hyperbolic size.
The Poincaré tool developed by Bill Horn is used to construct
the tessellation of the hyperbolic plane in the Poincaré disc
model with the Schläfli symbol {9, 3}, rendering an image of
the last author.

III. THE MAPPING METHOD 



As our goal is to build a realistic Internet map, ready 
for routing and other applications, we have to find for 
each AS its radial and angular coordinates $(r, \theta)$ maximising
the efficiency of greedy forwarding. This specific
task of maximising greedy forwarding efficiency calls
for a mapping method different from existing techniques 
on embedding Internet distances and graphs [21-23]. In 
view of our previous findings [12-15] that greedy forward- 
ing is exceptionally efficient in Internet-resembling syn- 
thetic networks, and that this efficiency is maximised in 
the Einsteinian model, our strategy for the Internet map
construction is to maximise the congruency between the 
map and the model. In statistical inference [24], this 
goal is equivalent to maximising the likelihood that the 
observed data, i.e., the Internet topology, has been pro- 
duced by the model. This likelihood is given by 



$$\mathcal{L} = \prod_{i<j} p(x_{ij})^{a_{ij}} \left[1 - p(x_{ij})\right]^{1 - a_{ij}}, \qquad (3)$$

where the elements $a_{ij}$ of the Internet adjacency matrix
are equal to 1 whenever there exists a connection between
ASs $i$ and $j$, and to 0 otherwise. While the adjacency matrix
represents the observed data, the connection probability
$p(x_{ij})$ depends via Eqs. (1) and (2) on the AS coordinates
$(r, \theta)$, which we try to infer. Our best estimates for these
coordinates are then those maximising the likelihood in
Eq. (3).

FIG. 2: Connection between hyperbolic geometry and scale-free
topology of complex networks illustrated by a synthetic
network in the Einsteinian model. All nodes lie within a
hyperbolic disc of radius $R$. The radial density of nodes grows
exponentially with the distance from the origin $O$, while their
average degree exponentially decreases, yielding a scale-free
degree distribution. The red lines show triangle $Oab$ made of
the hyperbolic geodesics connecting origin $O$ and two nodes $a$
and $b$. Geodesics $Oa$ and $Ob$ are the solid lines, while geodesic
$ab$ is the dashed curve. The thick blue links show the shortest
path between nodes $a$ and $b$ in the network.
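
To make the objective concrete, the logarithm of the likelihood in Eq. (3) can be evaluated by brute force as in the Python sketch below. This is only an illustration under stated assumptions: the data layout (a dictionary of coordinates and a set of edges) and all function names are hypothetical, the loop costs a number of operations quadratic in the number of nodes, and it is not the optimised procedure of Appendix B.

```python
import math
from itertools import combinations

def hyperbolic_distance(p, q):
    """Eq. (1) for two points given as (r, theta) tuples."""
    (r1, t1), (r2, t2) = p, q
    cosh_x = (math.cosh(r1) * math.cosh(r2)
              - math.sinh(r1) * math.sinh(r2) * math.cos(t1 - t2))
    return math.acosh(max(cosh_x, 1.0))

def softplus(z):
    """Numerically stable ln(1 + e^z)."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def log_likelihood(coords, edges, R, T):
    """ln of Eq. (3).

    coords: {node: (r, theta)}; edges: set of frozenset({i, j}).
    Both containers are assumptions of this sketch.
    """
    total = 0.0
    for i, j in combinations(sorted(coords), 2):
        z = (hyperbolic_distance(coords[i], coords[j]) - R) / (2.0 * T)
        if frozenset((i, j)) in edges:
            total -= softplus(z)    # ln p(x_ij)       = -ln(1 + e^{z})
        else:
            total -= softplus(-z)   # ln[1 - p(x_ij)]  = -ln(1 + e^{-z})
    return total

# Toy example: three nodes, one observed link.
coords = {0: (1.0, 0.0), 1: (2.0, 0.3), 2: (6.0, 3.0)}
edges = {frozenset((0, 1))}
print(log_likelihood(coords, edges, R=6.0, T=0.5))
```

Working with the logarithm turns the product over node pairs into a sum and avoids numerical underflow, which is why maximisation routines are usually applied to this quantity rather than to the likelihood itself.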

Although there are plenty of methods to find
maximum-likelihood solutions, e.g., the Metropolis-Hastings
algorithm [25], they perform poorly and do not
scale well on large datasets with abundant local maxima,
which is the case with the Internet. Therefore, as important
as a maximisation method is a heuristic approach
helping the maximisation algorithm to find the optimal
solution in a reasonable amount of time and with reasonable
computational resources. Our method is based
on the following remarkable property of networks in our
model; the same property holds for the Internet [17]. Let
$G$ be a given network with average degree $\bar{k}$ and power-law
degree distribution $P(k) \sim k^{-\gamma}$, and let $G(k_T)$ be
$G$'s subgraph composed of nodes with degree larger than
some threshold $k_T$, along with the connections among
these nodes. The average degree in $G(k_T)$ is then given
by $\bar{k}(k_T) = k_T^{3-\gamma}\,\bar{k}$ [17]. In scale-free networks with
exponent $\gamma$ between 2 and 3, this internal average degree
is thus a growing function of $k_T$, which implies that subgraphs
made of high-degree nodes almost surely form a
single connected component. Using this property along
with the statistical independence of the graph edges, it
becomes possible to infer coordinates of ASs in $G(k_T)$
ignoring the remainder of the AS graph. This property
is practically important because the size of $G(k_T)$ decreases
very fast as $k_T$ increases, which speeds up likelihood
maximisation algorithms tremendously. In a nutshell,
our method starts with a subgraph $G(k_T)$ small
enough for standard maximisation algorithms to
reliably and quickly infer the coordinates of ASs in
$G(k_T)$. Once these are found, we gradually decrease $k_T$
to iteratively add layers of lower-degree ASs. While doing
so, we use the already inferred AS coordinates as a reference
frame to assign initial coordinates to newly added
ASs. This initial coordinate assignment significantly improves
the convergence time of maximisation algorithms.
All other details of our mapping method can be found in
Appendix B.

FIG. 3: The hyperbolic map of the Internet is similar to a synthetic Einsteinian network in Fig. 2. The size of AS nodes is
proportional to the logarithm of their degrees. For the sake of clarity, only ASs with degree above 3, and only the connections
with probability p(x) > 0.5 given by Eq. (2) are shown. The font size of the country names is proportional to the logarithm of
the number of ASs that the country has. Only the names of countries with more than 10 ASs are included. The methods used
to map ASs to their countries are described in Appendix D.
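
The layered structure that the method relies on can be sketched in Python as follows. The sketch only extracts the degree-thresholded subgraphs and the successive layers of lower-degree nodes; the coordinate inference performed on each layer is not shown. The adjacency-dictionary representation, the function names, and the toy graph are assumptions made for illustration.

```python
def degree_thresholded_subgraph(adj, k_T):
    """Subgraph induced by nodes whose degree in the full graph exceeds k_T.

    adj: {node: set of neighbours} for an undirected graph (assumed layout).
    """
    keep = {v for v, nbrs in adj.items() if len(nbrs) > k_T}
    return {v: adj[v] & keep for v in keep}

def average_degree(adj):
    """Average degree counted inside the given (sub)graph."""
    if not adj:
        return 0.0
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def layers(adj, thresholds):
    """Yield (k_T, newly added nodes) for thresholds taken in decreasing order."""
    placed = set()
    for k_T in sorted(thresholds, reverse=True):
        nodes = set(degree_thresholded_subgraph(adj, k_T))
        yield k_T, nodes - placed
        placed |= nodes

# Toy graph: a hub (0) with four neighbours, two of which are also linked.
adj = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}}
for k_T, new_nodes in layers(adj, thresholds=[3, 1, 0]):
    sub = degree_thresholded_subgraph(adj, k_T)
    print(k_T, sorted(new_nodes), average_degree(sub))
```

In a mapping procedure of this kind, the nodes of the first (highest-threshold) layer would be embedded from scratch, and each later layer would be initialised relative to the nodes already placed.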

FIG. 4: Hyperbolic mapping of the Internet is successful,
as the empirical connection probability between ASs of degree
larger than 2 in the map closely follows the Einsteinian
model prediction. The whole range of hyperbolic distances
$x \in [0, 2R]$ is binned, and for each bin the ratio of the number
of connected AS pairs to the total number of AS pairs falling 
within this bin is shown. The distances between AS pairs are 
computed using Eq. (1). The blue dashed line is the connec- 
tion probability given by Eq. (2) with R = 27 and T = 0.69, 
which are the values used by the mapping method. 



IV. MAPPING RESULTS 

We apply our mapping method to the Internet 
AS topology extracted from the Archipelago project 
data [26] in June 2009 and described in Appendix C, and 
visualise the results in Fig. 3. We observe striking sim- 
ilarity between this visualisation and the synthetic Ein- 
steinian network in Fig. 2. To confirm that the Internet 
map we have obtained is indeed congruent with the Ein- 
steinian model, we juxtapose in Fig. 4 the empirical con- 
nection probability between ASs in the obtained Internet 
map against the theoretical one in Eq. (2). We observe 
a clear similarity between the two. The sphere is not
a perfect model of the Earth, nor is the Einsteinian model
an ideal abstraction of the Internet structure. Yet,
the observed similarity between the empirical and the- 
oretical connection probabilities in Fig. 4 suggests that 
hyperbolic metric spaces coupled with Fermi-like connec- 
tion probabilities are reasonable representations of the 
real Internet space. 

To investigate further the connections between the ob- 
tained map and Internet reality, we show in Fig. 3 the 
average angular position of all ASs belonging to the same 
country, while in Fig. 5 we draw the angular distri- 
butions of those ASs. Surprisingly, we find that even 
though our mapping method is completely geography- 
agnostic, it discovers meaningful groups or communities 
of ASs belonging to the same country. Furthermore, in 
Fig. 3 we find many cases of geographically or politically 
close countries placed close to each other in our hyper- 
bolic map. The explanation of these surprising effects is 
rooted in the peculiar nature of our mapping method. If
ASs belonging to the same country, geographic region,
or geo-political or economic group are connected more
densely to each other than to the rest of the world, then
this higher connection density translates to a higher attractive
force that tries to place all such ASs close to
each other in our map. Indeed, the term $p(x_{ij})^{a_{ij}}$ in
Eq. (3) corresponds to the attractive force between connected
nodes, while the term $[1 - p(x_{ij})]^{1 - a_{ij}}$ is the repulsive
force between disconnected ones. This peculiar
interplay between attraction within densely connected regions,
and repulsion across sparsely connected zones, effectively
maps ASs belonging to densely connected AS groups
close to each other. These observations build our confidence
that our mapping method provides meaningful results
reflecting peculiarities of the real Internet structure,
and suggest that the method can be adapted to discover
the community structure [27-29] in other complex networks.

FIG. 5: Hyperbolic mapping of the Internet yields meaningful
results, as ASs belonging to the same country are mapped
close to each other. The angular distributions of ASs in the
thirty largest countries in the world are shown. The "size" of
the country is the number of ASs it has. The graph shows
the percentage of ASs per bin of size 3.6°. For the majority
of countries, their ASs are localised in narrow regions. The
exceptions are the US, EU, and UK. The first two exceptions
are due to the significant geographic spread of ASs belonging to
the US or EU, the latter actually representing not one country
but a collection of countries.

V. ROUTING RESULTS

The obtained Internet map is ready for greedy for- 
warding. An AS holding a packet reads its destination 
AS coordinates, computes the hyperbolic distances be- 
tween this destination and each of its AS neighbours us- 
ing Eq. (1), and forwards the packet to the neighbour 
closest to the destination. To evaluate the performance 
of this process, we perform greedy forwarding from each 
source to each destination AS, and compute several per- 
formance metrics. 

The first metric is success ratio, which is the percentage 
of greedy paths that successfully reach their destinations. 
Not all paths are expected to be successful as some might 
run into local minima. For example, an AS might forward 
a packet to its neighbour who sends the packet back to 
the same AS, in which case the packet will never reach 
the destination. We declare a path unsuccessful if the
packet is sent to the same AS twice. The average success 
ratio of simple greedy forwarding in our Internet map 
is remarkably high, 97%, and more sophisticated greedy 
forwarding techniques, such as those described in [30], 
can boost it to 100%. 
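
The forwarding rule and the success criterion described above can be sketched in Python as follows. The graph representation, the function names, and the toy coordinates are assumptions made for illustration; a path is declared unsuccessful as soon as the packet would be sent to an AS that has already held it.

```python
import math

def hyperbolic_distance(p, q):
    """Eq. (1) for two points given as (r, theta) tuples."""
    if p == q:
        return 0.0
    (r1, t1), (r2, t2) = p, q
    cosh_x = (math.cosh(r1) * math.cosh(r2)
              - math.sinh(r1) * math.sinh(r2) * math.cos(t1 - t2))
    return math.acosh(max(cosh_x, 1.0))

def greedy_path(adj, coords, source, target):
    """Return the greedy path as a list of nodes, or None if it fails."""
    path, visited, current = [source], {source}, source
    while current != target:
        if not adj[current]:
            return None  # isolated node: nowhere to forward
        # Forward to the neighbour hyperbolically closest to the destination;
        # sorting makes tie-breaking deterministic.
        nxt = min(sorted(adj[current]),
                  key=lambda v: hyperbolic_distance(coords[v], coords[target]))
        if nxt in visited:
            return None  # packet would reach the same node twice
        path.append(nxt)
        visited.add(nxt)
        current = nxt
    return path

def success_ratio(adj, coords):
    """Fraction of ordered source-destination pairs reached by greedy forwarding."""
    pairs = [(s, t) for s in adj for t in adj if s != t]
    reached = sum(greedy_path(adj, coords, s, t) is not None for s, t in pairs)
    return reached / len(pairs)

# Toy example: two peripheral nodes connected through a hub near the origin.
adj = {"a": {"h"}, "h": {"a", "b"}, "b": {"h"}}
coords = {"a": (5.0, 0.2), "h": (0.5, 0.0), "b": (5.0, 3.0)}
print(greedy_path(adj, coords, "a", "b"), success_ratio(adj, coords))
```

Note that each node needs only the coordinates of its own neighbours and of the destination; no per-destination routing table appears anywhere in the sketch.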

Given the discussed connections between our Internet 
map and geography, one may conjecture that greedy for- 
warding simply mimics geographic routing following the 
geographically shortest paths. However, this conjecture 
is not true. Geography is reflected in our map only along 
the angular coordinate, while the radial coordinate is a 
function of the AS degree, making the space hyperbolic, 
see Appendix A. The geographic space is not hyperbolic, 
and if we use it for greedy forwarding, we obtain a much 
lower success ratio of approximately 14%. We also tested 
modified geographic routing that tries to intelligently use 
AS degrees, in the spirit of our Einsteinian model. Nevertheless,
this modification, although improving the success
ratio to 30%, still falls short compared to the results obtained
using our hyperbolic map. The details of these
experiments with geographic routing are in Appendix E. 

The second metric is stretch, which tells us by how 
much longer the greedy paths are, compared to short- 
est paths in the Internet topology. The average stretch 
is low, 1.1. The average hop-wise length of the shortest
paths between selected sources and destinations is 3.49, 
so that the average length of greedy paths is 3.86. The 
low value of stretch indicates that greedy paths are close 
to optimal, i.e., shortest paths. The shortest path be- 
tween nodes a and b in Fig. 2, for example, is also the path 
found by greedy forwarding. Somewhat unexpectedly, 
the greedy stretch is asymptotically optimal, i.e., equal 
to 1, in scale-free, strongly clustered networks regardless
of what underlying space is used for greedy forwarding [13].
Low stretch also implies that greedy forwarding
causes approximately the same traffic load on nodes as 
shortest-path forwarding. Given that shortest-path for- 
warding does not lead to high traffic load in scale-free 
networks [31], this finding allays concerns that hyperbolic
forwarding may cause traffic congestion abnormalities [32].
More details on this topic are in Appendix F.

FIG. 6: Greedy forwarding performs almost optimally in the
mapped Internet, as indicated by the success ratio, $p_s$, and
average stretch, $s$, after removal of a given fraction of AS nodes
(top left) or links (top right). The bottom plots show these
two metrics after removing a number of the highest-degree
nodes (bottom left), and a fraction of links among highest-degree
nodes (bottom right). The links are first ranked by
the product of node degrees that they connect, and then a
fraction of top-ranked links are removed. The giant connected
component is still present after all removals, but it drops to
85% of the original graph after the removal of 10 hubs.
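
The stretch metric can be sketched in Python as the ratio between the hop count of each successful greedy path and the hop count of a shortest path between the same endpoints, averaged over pairs. The sketch assumes that greedy paths have already been computed by some forwarding routine and are supplied as a dictionary; the function names and the toy graph are illustrative assumptions.

```python
from collections import deque

def shortest_hops(adj, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_stretch(adj, greedy_paths):
    """Mean ratio of greedy hop count to shortest-path hop count.

    greedy_paths: {(s, t): [s, ..., t]} for successful paths with s != t
    (an assumed input format).
    """
    ratios = []
    for (s, t), path in greedy_paths.items():
        optimal = shortest_hops(adj, s)[t]
        ratios.append((len(path) - 1) / optimal)
    return sum(ratios) / len(ratios)

# Toy example: a 4-cycle in which one supplied path takes the long way round.
adj = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
paths = {(1, 4): [1, 2, 3, 4], (1, 3): [1, 2, 3]}
print(average_stretch(adj, paths))
```

For orientation, the averages quoted above give 3.86/3.49 ≈ 1.11, consistent with the reported average stretch of 1.1.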

The two metrics above characterise the performance 
of greedy forwarding in the static Internet topology. 
More important than that is how greedy forwarding per- 
forms in the dynamic topology, where links and nodes 
can fail. We randomly select a percentage of links and 
nodes, remove them from the mapped Internet, recom- 
pute the success ratio and stretch after the removal, 
and present the result in the top plots of Fig. 6. Even 
upon simultaneous failures of up to 10% of AS links or
nodes, catastrophic events that have never happened in the
history of the Internet, we observe only minor degradation of the
performance of greedy forwarding. That is, even catastrophic
levels of damage to the Internet do not significantly
affect the performance of greedy forwarding, even
though no AS changes its position on the hyperbolic map.

A widely popularized feature of complex networks is 
their robustness with respect to random failures, and the 
lethality of failures of highest-degree hubs [33, 34]. As 
expected, we observe in the bottom plots of Fig. 6 that
removals of such hubs have a more detrimental effect on 
greedy forwarding as well. However, targeted removal of 
highest-degree ASs in the Internet is a rather unrealis- 
tic scenario since these large ASs consist of thousands of 
routers whose simultaneous failure is a very rare and un- 
likely event. The explanation for the surprising efficiency 
of greedy forwarding with respect to random failures lies 
in the unique combination of the following two properties 
exhibited by scale-free, strongly clustered networks: high
path diversity [31], and congruency between hyperbolic
geodesics and topologically shortest paths [15, 19, 20].
The latter is illustrated by the similar path patterns of 
the hyperbolic geodesic and topologically shortest path 
between nodes a and b in Fig. 2: they both first go to 
the high-degree core of the network, and then exit it in 
the appropriate direction to the destination. Due to high 
path diversity, there are many disjoint shortest paths be- 
tween the same source and destination, and thanks to 
the congruency, they all stay close to the corresponding 
hyperbolic geodesics. Link and node failures affect some
shortest paths, but others remain, and greedy forwarding 
can still find them using the same hyperbolic map. 

Another form of Internet dynamics is its rapid growth 
over years [5, 6, 35, 36]. We show in Appendix G that if 
the existing ASs keep their hyperbolic coordinates fixed, 
while the ASs joining the Internet anew compute their co- 
ordinates using local information, then the performance 
of greedy forwarding does not significantly degrade, even 
at long time scales. In a nutshell, the existing AS coor- 
dinates are essentially static, as they can stay the same 
for years. 

Existing Internet topology measurements including the 
Archipelago data [26] are known to be incomplete and 
miss some AS links. Therefore a natural question is how 
this missing information affects the quality of the con- 
structed map, and the performance of greedy forward- 
ing in it. Intuitively, since the performance of greedy 
forwarding is robust with respect to link removals,
we might expect it to be robust with respect to miss- 
ing links as well. Moreover, if the constructed map is 
used in practice, then greedy forwarding will see and use 
those links that topology measurements do not see. We 
might thus also intuitively expect greedy forwarding to 
perform better in practice than we report in this section, 
simply because those missing links, when used by greedy 
forwarding, would provide additional shortcuts between 
potentially remote ASs. We confirm this intuition in Ap- 
pendix H with experiments emulating the missing link 
issue. Therefore the routing results reported here should 
actually be considered as lower bounds for greedy rout- 
ing performance that can be achieved in practice using 
the constructed hyperbolic Internet map. 



VI. CONCLUSION 

We have constructed a hyperbolic map of the Internet, 
and release this map with this paper [37]. The map can 
be used for essentially infinitely scalable Internet rout- 
ing. The amount of routing information that ASs must 
maintain is proportional to the AS degree, which is the- 
oretically best possible since ASs must always keep some 
information about their neighbours. Routing commu- 
nication overheads are also minimised, since ASs do not 
exchange any routing information upon dynamic changes 
of the AS topology. The presented solution thus achieves
routing efficiency close to theoretically optimal, and resolves
serious scaling limitations that the Internet faces
today. 



The mapping method we have employed is generic, and 
can be applied to other complex networks with under- 
lying metric structures and heterogeneous degree distri- 
butions. We showed in [17] that a good indicator for 
the presence of an underlying metric structure is self- 
similarity of clustering in the network, while in [19, 20] 
we showed that as soon as a metric space is present, and 
the network has a heterogeneous degree distribution, the 
metric distances can be rescaled such that the underlying 
geometry is effectively hyperbolic. Roughly, self-similar 
clustering is responsible for the metric structure along 
the angular coordinate, while degree heterogeneity adds 
the radial dimension, and makes the space hyperbolic. 
Applied to other networks, our mapping method can pro- 
vide a different perspective on the community structure 
in networks. Instead of trying to split nodes into discrete 
community sets [27-29], it would naturally yield a con- 
tinuous measure of similarity between nodes based on hy- 
perbolic distances. More similar nodes would be located 
closer to each other, and form zones of higher connectiv- 
ity density. It would then be up to an experimenter to
define communities, if needed, as histograms of the node 
density in the hyperbolic space. The spectrum of poten- 
tial applications of this network-mapping geometrisation 
agenda is wide. Network mapping can reveal geomet- 
ric forces effectively driving information signaling in the 
network; examples include the brain [38] and cell signaling
networks [39]. One can then potentially predict what
network perturbations drive these networks to failures,
such as brain disorders or cancer. Other applications
range from recommender systems [40], where having the
right measure of similarity between consumers is key,
to epidemic spreading [41] and information theory of net- 
works [42]. 



We have shown that the Internet hyperbolic map is 
remarkably robust with respect to even substantial per- 
turbations of the Internet topology, implying that this 
map is essentially static. It does not significantly de- 
pend on topology dynamics, and can thus be computed 
only once. This property is desirable in view of long 
running times intrinsic to likelihood maximisation algo- 
rithms. Our method improves their running times dras- 
tically, and the Internet map computations take approx- 
imately a day on a modern computer. However, for sub- 
stantially larger networks the running times may still be 
prohibitive even for one-time mapping. Therefore, alter- 
native methods for network mapping, not relying on like- 
lihood maximisation, are highly desirable, and our work 
in this direction is underway.

Appendix A: The Einsteinian and Newtonian
models of complex networks

To synthesise a network with our Einsteinian model, one first
has to specify any desired network size $N$, average degree
$\bar{k}$, average clustering $C$, and exponent $\gamma > 2$ of
the power-law distribution $P(k)$ of node degrees $k$,
$P(k) \sim k^{-\gamma}$. Equipped with these target properties of
the network topology, we first distribute quasi-uniformly $N$
nodes within a hyperbolic disc of radius $R = 2\ln(N/c)$, where
$c$ is given by

$$c = \bar{k}\,\frac{\sin \pi T}{2T}\left(\frac{\gamma-2}{\gamma-1}\right)^{2} \tag{A1}$$

and $T \in [0, 1]$ is a function of $C$. In the hyperbolic plane,
the quasi-uniform node density means that the node angular
coordinates $\theta \in [0, 2\pi]$ are distributed uniformly,
while their radial coordinates $r \in [0, R]$ are distributed
with density

$$\rho(r) = \alpha e^{\alpha(r-R)} \tag{A2}$$

where $\alpha = (\gamma-1)/2$. Once all nodes are in place
specified by their assigned coordinates, the hyperbolic distance
$x_{ij}$ between each pair of nodes $i$ and $j$ located at
$(r_i, \theta_i)$ and $(r_j, \theta_j)$ is computed using
Eq. (1). Given these distances, each pair of nodes $i$ and $j$ is
then connected by a link with probability $p(x_{ij})$ given by
Eq. (2). After each node pair is examined and connected with
probability $p(x_{ij})$, the network is formed, and we can
compute the average degree $\bar{k}(r)$ of nodes located at
distance $r$ from the origin. The result is

$$\bar{k}(r) = \bar{k}\,\frac{\gamma-2}{\gamma-1}\,e^{(R-r)/2} \tag{A3}$$

which combined with Eq. (A2) yields the target degree
distribution $P(k)$. The Newtonian model is isomorphic to the
Einsteinian one via a simple change of variables reminiscent of
Eq. (A3):

$$\kappa = \kappa_0\,e^{(R-r)/2} \tag{A4}$$

where $\kappa$ is the expected degree of a node in the Newtonian
model, and $\kappa_0$ is the minimum expected degree. See [19,
20] for further details.
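
As a rough numerical illustration of Eqs. (A1)-(A3), the following Python sketch computes the disc radius and draws radial coordinates by inverting the cumulative distribution implied by Eq. (A2). The function names are ours, not the paper's, and the sampler simply clips at $r = 0$, which slightly distorts the density near the origin. It is a sketch under these assumptions, not the generator used for the results reported here.

```python
import math
import random


def disc_radius(n, kbar, gamma, temperature):
    """Hyperbolic disc radius R = 2 ln(N / c), with c taken from Eq. (A1)."""
    c = (kbar * math.sin(math.pi * temperature) / (2.0 * temperature)
         * ((gamma - 2.0) / (gamma - 1.0)) ** 2)
    return 2.0 * math.log(n / c)


def sample_radius(radius, gamma, rng=random):
    """Draw r in [0, R] with density close to alpha * exp(alpha * (r - R)), Eq. (A2)."""
    alpha = (gamma - 1.0) / 2.0
    u = 1.0 - rng.random()  # u lies in (0, 1], which avoids log(0)
    return max(0.0, radius + math.log(u) / alpha)  # clipping at 0 is our simplification


def expected_degree_at(r, radius, kbar, gamma):
    """Average degree of nodes at radial coordinate r, Eq. (A3)."""
    return kbar * (gamma - 2.0) / (gamma - 1.0) * math.exp((radius - r) / 2.0)
```

With the Internet parameters estimated in Appendix B ($N = 35685$, $\bar{k} = 9.86$, $\gamma = 2.1$, $T = 1/1.45$), `disc_radius` returns a value close to the $R = 27$ quoted in Appendix C.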



Appendix B: Mapping methods 

To find our hyperbolic Internet map, we use the equivalence
between the Einsteinian-$\mathbb{H}^2$ [19, 20] and the
Newtonian-$\mathbb{S}^1$ [17] models. This equivalence
establishes a relationship in Eq. (A4) between the expected
degree $\kappa$ of a node in the Newtonian-$\mathbb{S}^1$ model,
and its radial coordinate $r$ in the Einsteinian-$\mathbb{H}^2$
model. The angular coordinate $\theta$ is the same in both
models. Thus, for a given node $i$ we aim to find its expected
degree and angular coordinate, $\{\kappa_i, \theta_i\}$, that
best match the Newtonian-$\mathbb{S}^1$ model. We then use the
$\kappa$-to-$r$ mapping to place nodes in the hyperbolic plane
according to the Einsteinian-$\mathbb{H}^2$ model.

Thanks to their equivalence, the Newtonian-$\mathbb{S}^1$ and
Einsteinian-$\mathbb{H}^2$ models generate statistically the same
network topologies. However, the efficiency of greedy forwarding
in the Einsteinian-$\mathbb{H}^2$ model is higher, because
hyperbolic geodesics are exceptionally congruent with the
topologically shortest paths in scale-free networks [14, 19, 20].
The reason for this congruency is that the effective distance
used as an argument of the connection probability in the
Newtonian-$\mathbb{S}^1$ model is actually hyperbolic [19, 20],
and the Einsteinian-$\mathbb{H}^2$ model simply translates this
effective distance to the real hyperbolic one. For these reasons
we prefer the Einsteinian-$\mathbb{H}^2$ model for routing
purposes, although we use the Newtonian-$\mathbb{S}^1$ one to
find the Internet map. We could use the Einsteinian-$\mathbb{H}^2$
model directly for this purpose, but the Newtonian-$\mathbb{S}^1$
model is technically simpler since the statistical inference in
it can be performed independently for the two variables $\kappa$
and $\theta$.

We first recall the Newtonian-$\mathbb{S}^1$ model, which
generates networks according to the following steps:

1. Distribute $N$ nodes uniformly over the circle $\mathbb{S}^1$
of radius $N/(2\pi)$, so that the node density on the circle is
fixed to 1 [66].

2. Assign to all nodes a hidden variable $\kappa$ representing
their expected degrees. To generate scale-free networks,
$\kappa$ is drawn from the power-law distribution

$$\rho(\kappa) = \kappa_0^{\gamma-1}(\gamma-1)\,\kappa^{-\gamma}, \quad \kappa \in [\kappa_0, \infty), \tag{B1}$$

$$\kappa_0 = \bar{k}\,\frac{\gamma-2}{\gamma-1}, \tag{B2}$$

where $\kappa_0$ is the minimum expected degree, and $\bar{k}$ is
the network average degree [67].

3. Let $\kappa$ and $\kappa'$ be the expected degrees of two
nodes located at distance $d = N\Delta\theta/(2\pi)$ measured
over the circle, where $\Delta\theta$ is the angular distance
between the nodes. Connect each pair of nodes with probability
$p(\chi)$, where the effective distance
$\chi = d/(\mu\kappa\kappa')$, and $\mu$ is a constant fixing the
average degree.

The connection probability $p(\chi)$ can be any integrable
function. Here we chose the Fermi-Dirac distribution

$$p(\chi) = \frac{1}{1+\chi^{\beta}}, \tag{B3}$$

where $\beta = 1/T$ is a parameter that controls clustering in
the network. With this connection probability, parameter $\mu$
becomes

$$\mu = \frac{\beta}{2\pi\bar{k}}\,\sin\frac{\pi}{\beta}. \tag{B4}$$

The expected degree of a node with hidden variable $\kappa$ is
$\bar{k}(\kappa) = \kappa$ and, therefore, the degree
distribution scales as $P(k) \sim k^{-\gamma}$ for large $k$.
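
The three generation steps and Eqs. (B1)-(B4) can be sketched in a few lines of Python. The function name `generate_s1_network`, the inverse-transform sampler for $\kappa$, and the brute-force loop over all node pairs are our choices for illustration; the quadratic loop is only practical for small $N$.

```python
import math
import random


def generate_s1_network(n, kbar, gamma, beta, seed=0):
    """Sample a network from the S^1 model; returns (kappa, theta, edges)."""
    rng = random.Random(seed)
    kappa0 = kbar * (gamma - 2.0) / (gamma - 1.0)                   # Eq. (B2)
    mu = beta * math.sin(math.pi / beta) / (2.0 * math.pi * kbar)   # Eq. (B4)
    # Step 1: uniform angular positions on the circle.
    theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    # Step 2: inverse-transform sampling of the power law in Eq. (B1).
    kappa = [kappa0 * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0))
             for _ in range(n)]
    # Step 3: connect every pair with probability p(chi), Eq. (B3).
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            dtheta = math.pi - abs(math.pi - abs(theta[i] - theta[j]))
            chi = n * dtheta / (2.0 * math.pi) / (mu * kappa[i] * kappa[j])
            if rng.random() < 1.0 / (1.0 + chi ** beta):
                edges.append((i, j))
    return kappa, theta, edges
```

For a quick check one can call, for example, `generate_s1_network(400, 6.0, 3.5, 2.0)` and compare `2 * len(edges) / 400` with the target average degree. These parameter values are illustrative and are not the Internet's.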

To go from the Newtonian-$\mathbb{S}^1$ to the
Einsteinian-$\mathbb{H}^2$ model, we leave the angular coordinate
$\theta$ unchanged, while the radial coordinate of a node with
expected degree $\kappa$ is given by

$$r = R - 2\ln\frac{\kappa}{\kappa_0}, \tag{B5}$$

where the radius of the hyperbolic disk containing all nodes is

$$R = 2\ln\frac{N}{\pi\mu\kappa_0^{2}}. \tag{B6}$$
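
A minimal sketch of this change of variables, assuming the definitions of $\kappa_0$ and $\mu$ in Eqs. (B2) and (B4); the function name is hypothetical.

```python
import math


def s1_to_h2_radius(kappa, n, kbar, gamma, beta):
    """Return (r, R): the radial coordinate from Eq. (B5) and disc radius from Eq. (B6)."""
    kappa0 = kbar * (gamma - 2.0) / (gamma - 1.0)                   # Eq. (B2)
    mu = beta * math.sin(math.pi / beta) / (2.0 * math.pi * kbar)   # Eq. (B4)
    disc_r = 2.0 * math.log(n / (math.pi * mu * kappa0 ** 2))       # Eq. (B6)
    return disc_r - 2.0 * math.log(kappa / kappa0), disc_r          # Eq. (B5)
```

A node with the minimum expected degree $\kappa_0$ sits on the disc boundary, $r = R$, and nodes with larger $\kappa$ move towards the centre.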



1. General theory behind likelihood maximization 

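We now fit the real AS graph to the model. Specifically, given
the measured AS graph, we aim to find the set of coordinates
$\{\kappa_i, \theta_i\}$, $i = 1, \ldots, N$, that best match the
Newtonian-$\mathbb{S}^1$ model in a statistical sense. To do so,
we use maximum likelihood estimation (MLE) techniques. Let us
compute the posterior probability, or likelihood, that a network
given by its adjacency matrix $a_{ij}$ is generated by the
Newtonian-$\mathbb{S}^1$ model,
$\mathcal{L}(a_{ij}|\gamma,\beta,\bar{k})$. This probability is

$$\mathcal{L}(a_{ij}|\gamma,\beta,\bar{k}) = \int\cdots\int \mathcal{L}(a_{ij},\{\kappa_i,\theta_i\}|\gamma,\beta,\bar{k}) \prod_{i=1}^{N} d\theta_i\,d\kappa_i, \tag{B7}$$

where function
$\mathcal{L}(a_{ij},\{\kappa_i,\theta_i\}|\gamma,\beta,\bar{k})$
within the integral is the joint probability that the model
generates the adjacency matrix $a_{ij}$ and the set of hidden
variables $\{\kappa_i,\theta_i\}$. Using Bayes' rule, we find the
likelihood that the hidden variables take particular values
$\{\kappa_i,\theta_i\}$ in the network given by its observed
adjacency matrix $a_{ij}$:

$$\mathcal{L}(\{\kappa_i,\theta_i\}|a_{ij},\gamma,\beta,\bar{k}) = \frac{\mathrm{Prob}(\{\kappa_i,\theta_i\})\,\mathcal{L}(a_{ij}|\{\kappa_i,\theta_i\},\gamma,\beta,\bar{k})}{\mathcal{L}(a_{ij}|\gamma,\beta,\bar{k})}, \tag{B8}$$

where

$$\mathrm{Prob}(\{\kappa_i,\theta_i\}) = \frac{1}{(2\pi)^{N}}\prod_{i=1}^{N}\rho(\kappa_i) \tag{B9}$$

is the prior probability of the hidden variables given by the
model,

$$\mathcal{L}(a_{ij}|\{\kappa_i,\theta_i\},\gamma,\beta,\bar{k}) = \prod_{i<j} p(\chi_{ij})^{a_{ij}}\,[1-p(\chi_{ij})]^{1-a_{ij}} \tag{B10}$$

is the likelihood of finding $a_{ij}$ if the hidden variables are
$\{\kappa_i,\theta_i\}$, and

$$\chi_{ij} = \frac{N\bar{k}\,\Delta\theta_{ij}}{\beta\sin(\pi/\beta)\,\kappa_i\kappa_j}, \tag{B11}$$

$$\Delta\theta_{ij} = \pi - \big|\pi - |\theta_i - \theta_j|\big|. \tag{B12}$$

The MLE values of the hidden variables
$\{\kappa_i^*, \theta_i^*\}$ are then those that maximize the
likelihood in Eq. (B8) or, equivalently, its logarithm,

$$\ln\mathcal{L}(\{\kappa_i,\theta_i\}|a_{ij},\gamma,\beta,\bar{k}) = C - \gamma\sum_{i=1}^{N}\ln\kappa_i + \sum_{i<j} a_{ij}\ln p(\chi_{ij}) + \sum_{i<j}(1-a_{ij})\ln[1-p(\chi_{ij})], \tag{B13}$$

where $C$ is independent of $\kappa_i$ and $\theta_i$.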
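
To make Eq. (B13) concrete, here is a minimal Python sketch that evaluates the log-likelihood up to the constant $C$ for a given set of coordinates and an edge list. The helper names and the small `eps` guard against $\ln 0$ are our additions, and the double loop over pairs is the naive $O(N^2)$ evaluation.

```python
import math


def angular_distance(t1, t2):
    """Eq. (B12): shortest angular separation, for angles in [0, 2*pi]."""
    return math.pi - abs(math.pi - abs(t1 - t2))


def connection_probability(chi, beta):
    """Eq. (B3)."""
    return 1.0 / (1.0 + chi ** beta)


def log_likelihood(kappa, theta, edges, gamma, beta, kbar, eps=1e-300):
    """Eq. (B13) without the constant C; edges is an iterable of (i, j) pairs."""
    n = len(kappa)
    edge_set = {frozenset(e) for e in edges}
    scale = n * kbar / (beta * math.sin(math.pi / beta))  # prefactor of Eq. (B11)
    total = -gamma * sum(math.log(k) for k in kappa)
    for i in range(n):
        for j in range(i + 1, n):
            chi = scale * angular_distance(theta[i], theta[j]) / (kappa[i] * kappa[j])
            p = connection_probability(chi, beta)
            if frozenset((i, j)) in edge_set:
                total += math.log(max(p, eps))        # connected pair
            else:
                total += math.log(max(1.0 - p, eps))  # disconnected pair
    return total
```

For two nodes with $\kappa = 1$ at angular distance 1, and $\beta = 2$, $\bar{k} = 1$, the effective distance is $\chi = 1$, so the log-likelihood of the single pair is $\ln(1/2)$ whether or not the link is present.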

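2. MLE for expected degrees $\kappa$

The derivative of Eq. (B13) with respect to expected degree
$\kappa_l$ of node $l$ is

$$\frac{\partial}{\partial\kappa_l}\ln\mathcal{L}(\{\kappa_i,\theta_i\}|a_{ij},\gamma,\beta,\bar{k}) = -\frac{\gamma}{\kappa_l} - \frac{\beta}{\kappa_l}\left(\sum_{j\neq l} p(\chi_{lj}) - \sum_{j\neq l} a_{lj}\right). \tag{B14}$$

The first term within the parenthesis is the expected degree of
node $l$, while the second term is its actual degree $k_l$.
Therefore, the value $\kappa_l^*$ that maximizes the likelihood
is given by

$$\kappa_l^* = k_l - \frac{\gamma}{\beta}. \tag{B15}$$

Since $\kappa_l^*$ can be smaller than $\kappa_0$ in the last
equation, we set

$$\kappa_l^* = \max\left(\frac{\gamma-2}{\gamma-1}\,\bar{k},\; k_l - \frac{\gamma}{\beta}\right). \tag{B16}$$

We discuss a correction of this equation accounting for finite
size effects in Section B 4 b.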
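
Written as code, the estimator in Eq. (B16) is a one-liner. The function name below is hypothetical.

```python
def kappa_mle(degree, gamma, beta, kbar):
    """Eq. (B16): MLE of the expected degree, floored at kappa_0 from Eq. (B2)."""
    kappa0 = kbar * (gamma - 2.0) / (gamma - 1.0)
    return max(kappa0, degree - gamma / beta)
```

For instance, with $\gamma = 2.1$, $\beta = 1.45$ and $\bar{k} = 9.86$, a node of degree 10 gets $\kappa^* \approx 8.55$, whereas a node of degree 1 is assigned the floor $\kappa_0 \approx 0.90$.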



3. MLE for angular coordinates $\theta$

Having found the MLE values for expected degrees $\kappa$, we now
have to maximize Eq. (B8) with respect to angular coordinates
$\theta$. This task is equivalent to maximizing the partial
log-likelihood

$$\ln\mathcal{L}(a_{ij}|\{\kappa_i^*,\theta_i\},\gamma,\beta,\bar{k}) = \sum_{i<j} a_{ij}\ln p(\chi_{ij}) + \sum_{i<j}(1-a_{ij})\ln[1-p(\chi_{ij})]. \tag{B17}$$

The first term in this equation involves only pairs of connected
nodes, whereas the second term accounts for pairs of disconnected
ones. Since the connection probability $p(\chi)$ is a
monotonically decreasing function of the effective distance
$\chi$, the first term in Eq. (B17) is large if pairs of
connected nodes are placed close to each other. In contrast, the
second term is large if pairs of disconnected nodes are far
apart. Therefore the optimal MLE solution will balance both
effects, and place connected nodes as close as possible while
keeping disconnected ones as far as possible.

Unfortunately, the maximization of Eq. (B17) with respect to the
angular coordinates cannot be performed analytically. We thus
have to rely on approximations. At their core lie MLE algorithms,
or kernels, which we discuss first. We present two such kernels,
standard Metropolis-Hastings (SMH) [25], and our "localized"
version of it (LMH).



a. MLE kernels 

In the standard Metropolis-Hastings (SMH) algorithm, a node is
chosen at random, and given a new angular position chosen
uniformly in the interval $[0, 2\pi]$. The change is accepted
whenever the likelihood in Eq. (B17) computed after the change,
$\mathcal{L}_{new}$ [68], is larger than the likelihood computed
with the old coordinate, $\mathcal{L}_{old}$. Otherwise, the
change is accepted with probability
$\mathcal{L}_{new}/\mathcal{L}_{old}$. The SMH algorithm samples
the angular phase space, and produces angular configurations with
a probability proportional to the likelihood. The computational
complexity of SMH depends on the particular system to which it is
applied. We find that in our case the number of node moves
sufficient for SMH to converge is $O(N^2)$, meaning that the
total running time complexity is $O(N^3)$, since each move
attempt involves the $O(N)$ computation of the likelihood change.
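
The acceptance rule described above can be sketched as follows, working with log-likelihoods so that the ratio $\mathcal{L}_{new}/\mathcal{L}_{old}$ becomes a difference. The callback `loglik_fn` is a hypothetical stand-in for an evaluation of Eq. (B17). For brevity this sketch recomputes the full log-likelihood at every move instead of the $O(N)$ change.

```python
import math
import random


def smh_accept(loglik_new, loglik_old, rng=random):
    """Accept if the likelihood increased, else with probability L_new / L_old."""
    if loglik_new >= loglik_old:
        return True
    return rng.random() < math.exp(loglik_new - loglik_old)


def smh_step(theta, loglik_fn, rng=random):
    """One SMH move on the list of angles `theta` (modified in place and returned)."""
    i = rng.randrange(len(theta))                 # node chosen at random
    old_angle = theta[i]
    old_loglik = loglik_fn(theta)
    theta[i] = rng.uniform(0.0, 2.0 * math.pi)    # uniform proposal in [0, 2*pi]
    if not smh_accept(loglik_fn(theta), old_loglik, rng):
        theta[i] = old_angle                      # rejected: restore the old angle
    return theta
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.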

Our localized Metropolis-Hastings (LMH) algorithm is not MH per
se. In fact it bears stronger resemblances to extremal
optimization and genetic search algorithms than to MH. We first
define the local contribution $\ln\mathcal{L}_i$ of node $i$ to
the total log-likelihood $\ln\mathcal{L}$ in Eq. (B17):

$$\ln\mathcal{L}_i = \sum_{j\neq i} a_{ij}\ln p(\chi_{ij}) + \sum_{j\neq i}(1-a_{ij})\ln[1-p(\chi_{ij})], \tag{B18}$$

so that $\ln\mathcal{L} = \frac{1}{2}\sum_i \ln\mathcal{L}_i$. We
can interpret function $\ln\mathcal{L}_i$ as the fitness of node
$i$, which we can then use to maximize the total likelihood.
Specifically, in LMH nodes are visited in rounds, and during each
round all nodes are visited one by one. At each node visit, the
node is moved to the angular position that maximizes its fitness
$\ln\mathcal{L}_i$, having fixed the positions of all other nodes
at that particular node visit. An example of the log-likelihood
landscape that a node sees during its move is shown in the top
plot of Fig. 7. The total number of rounds of all-node visits
needed for LMH to converge is of the order of the network average
degree. Indeed, even though after each node move, the fitness of
other nodes changes, the node fitness is mostly affected by
changes of coordinates of the node neighbors, whose average
number thus roughly determines the number of rounds. The
maximization of the fitness of a node takes $O(N^2)$ time, having
fitness $\ln\mathcal{L}_i$ sampled at intervals with
$\Delta\theta = 1/N$. Therefore for sparse graphs, the overall
computational complexity of LMH is $O(N^3)$.

Applied to the real Internet and synthetic Internet-like networks
below, both SMH and LMH yield similarly good results. However, we
prefer LMH since by its localized nature, it can be implemented
in a distributed manner, an important property for deployment in
the real Internet. Even more importantly, with LMH, new-coming
ASs can compute their coordinates in a distributed manner without
knowing the global Internet topology. Indeed, to compute its
coordinates using Eq. (B18), a new-coming AS $i$ has to know the
status of connections only to its neighbors; the status of
connections between any two ASs other than $i$ does not
contribute to $\ln\mathcal{L}_i$ in Eq. (B18). All results shown
in this paper are for LMH.
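
A single LMH node visit can be sketched as a grid search over candidate angles that maximizes the node fitness of Eq. (B18). In the sketch below, `neighbors` is assumed to be a list of sets of neighbor indices, the grid resolution `grid` is a free parameter of ours (the sampling interval $\Delta\theta = 1/N$ mentioned above would correspond to about $2\pi N$ grid points), and probabilities are clamped by `eps` to avoid $\ln 0$.

```python
import math


def local_log_likelihood(i, angle, kappa, theta, neighbors, beta, kbar, eps=1e-12):
    """Fitness ln L_i of Eq. (B18) with node i tentatively placed at `angle`."""
    n = len(kappa)
    scale = n * kbar / (beta * math.sin(math.pi / beta))  # prefactor of Eq. (B11)
    total = 0.0
    for j in range(n):
        if j == i:
            continue
        dtheta = math.pi - abs(math.pi - abs(angle - theta[j]))  # Eq. (B12)
        chi = scale * dtheta / (kappa[i] * kappa[j])
        p = min(max(1.0 / (1.0 + chi ** beta), eps), 1.0 - eps)
        total += math.log(p) if j in neighbors[i] else math.log(1.0 - p)
    return total


def lmh_move(i, kappa, theta, neighbors, beta, kbar, grid=360):
    """Return the grid angle maximizing the fitness of node i, others held fixed."""
    candidates = [2.0 * math.pi * m / grid for m in range(grid)]
    return max(candidates, key=lambda a: local_log_likelihood(
        i, a, kappa, theta, neighbors, beta, kbar))
```

Only the row of the adjacency matrix belonging to node $i$ enters the computation, which is what allows a new-coming node to place itself without knowing the links among other nodes.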



b. First MLE wrapper 

If we naively applied any MLE kernel to the Internet, we
would have to wait forever for good results. We see in the 
top plot of Fig. 7 that the characteristic likelihood profile 
has abundant local maxima. Therefore an MLE kernel is 
not guaranteed to converge to the global maximum in a 
reasonable amount of time. It is thus imperative to find 
a heuristic procedure, i.e., an MLE wrapper, helping an 
MLE kernel to find its way towards the global maximum 
without being trapped in local maxima. This procedure 
is equivalent to using all available information to make 
an educated guess of the initial node coordinates. 

Our MLE wrapping strategy is based on statistical in- 
dependence of edges in our graphs. Thanks to this in- 
dependence, the coordinates of a set of nodes can be in- 
ferred based only on the partial information contained 
in a subgraph formed by the nodes in the set, ignoring 
the rest of the network. Consider a small subgraph of 
the whole network, for our purposes made of high de- 
gree nodes, and remove all nodes and connections not 
belonging to this subgraph. Since edges in this subgraph 
are statistically independent of other edges, we can max- 
imize the likelihood corresponding to the subgraph, and 
infer the coordinates of the nodes in it based only on this 
partial information. If the subgraph is small and dense 
enough, finding the optimal MLE solution is easy. Once 
this solution is found, we can add more nodes to the net- 
work, and use the previously inferred coordinates as the 
initial configuration for the new MLE problem. However, this
method works only if the subgraph forms a single connected
component. This property holds for synthetic networks in our
model, and for the real Internet [17].

FIG. 7: Top: example of the local log-likelihood of node $i$
(Eq. (B18)), having the coordinates of all other nodes fixed at
an intermediate step of the mapping process. Bottom: inferred
angular coordinates vs. real ones for the 250 most connected
nodes in a synthetic $\mathbb{S}^1$ network with the same
properties as the real Internet: $\gamma = 2.1$, $\beta = 2$,
$N \approx 24000$, $\bar{k} \approx 5$. The LMH kernel and Alg. 1
are used for coordinate inference.

Formally, let $k_1, k_2, \ldots, k_m$, with
$k_1 > k_2 > \cdots > k_m$, be a set of predefined degrees, and
let $\mathcal{G}(k_l)$, $l = 1, \ldots, m$, be the subgraphs
formed by all nodes of degrees larger or equal to $k_l$, plus all
connections among them. We thus have
$\mathcal{G}(k_1) \subset \mathcal{G}(k_2) \subset \cdots \subset \mathcal{G}(k_m)$,
forming a hierarchy of nested subgraphs. The main idea behind our
MLE wrapper is to run the MLE kernel, either SMH or LMH, in
iterations, starting with the smallest subgraph, and feeding the
coordinates inferred at each iteration to the MLE kernel at the
next iteration.

This idea must be implemented with care. First, subgraph
$\mathcal{G}(k_1)$ is different from other subgraphs. Indeed, in
scale-free networks, all nodes of degrees larger than
$\sim N^{1/2}$ are connected almost surely. Therefore all such
nodes would appear identical to the MLE kernel, which would thus
place them all at the same location, something that we have to
avoid. To solve this problem, we remove all connections among
nodes of degree larger or equal to $k_1 \sim N^{1/2}$ and start
the wrapper algorithm with the $\mathcal{G}(k_2)$ iteration.
Second, iterating from $\mathcal{G}(k_l)$ to
$\mathcal{G}(k_{l+1})$, we still need to specify the initial
coordinates of the nodes that belong to $\mathcal{G}(k_{l+1})$
but not to $\mathcal{G}(k_l)$ [69]. While the assignment of
random coordinates to new nodes is possible, it is much more
efficient to try to maximize the likelihood in
$\mathcal{G}(k_{l+1})$ from the very beginning. In other words,
we assign to each new node
$i \in \mathcal{G}(k_{l+1}) \setminus \mathcal{G}(k_l)$ the
coordinate maximizing

$$\ln\mathcal{L}_i[\mathcal{G}(k_l)] = \sum_{j\in\mathcal{G}(k_l)} a_{ij}\ln p(\chi_{ij}) + \sum_{j\in\mathcal{G}(k_l)} (1-a_{ij})\ln[1-p(\chi_{ij})]. \tag{B19}$$

We note that node $i$ uses information contained only in
$\mathcal{G}(k_l)$ to get its initial coordinate. After all new
nodes corresponding to a given iteration are introduced and
assigned initial coordinates, we apply the MLE kernel to the
resulting system. This heuristic MLE wrapping procedure is
summarized in Alg. 1.

Algorithm 1 First MLE wrapper
  activate nodes in $\mathcal{G}(k_1)$
  assign random angular coordinates to nodes in $\mathcal{G}(k_1)$
  remove links among nodes in $\mathcal{G}(k_1)$
  for $l$ = 2 to (# of layers) do
    for $j$ = 1 to (# of nodes in $\mathcal{G}(k_l)$ not active) do
      $i$ ← label of new node in $\mathcal{G}(k_l)$ not active
      if # of connections of $i$ with nodes in $\mathcal{G}(k_{l-1})$ > 2 then
        activate new node $i$
        assign to node $i$ coordinate $\theta_i$ maximizing $\ln\mathcal{L}_i[\mathcal{G}(k_{l-1})]$
      end if
    end for
    run the MLE kernel on the set of active nodes
  end for
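
The nested hierarchy of subgraphs that the wrapper iterates over can be built directly from the node degrees. The sketch below, with hypothetical function names, returns for each threshold $k_l$ the node set of $\mathcal{G}(k_l)$, and then the nodes that are new in each layer; links are left out for brevity.

```python
def nested_layers(degrees, thresholds):
    """degrees: dict node -> degree; thresholds: k_1 > k_2 > ... > k_m.

    Returns the list of node sets of G(k_1), G(k_2), ..., G(k_m).
    """
    return [{v for v, k in degrees.items() if k >= k_l} for k_l in thresholds]


def new_nodes_per_layer(layers):
    """Nodes entering at each iteration: G(k_1), then G(k_{l+1}) minus G(k_l)."""
    fresh = [set(layers[0])]
    for previous, current in zip(layers, layers[1:]):
        fresh.append(current - previous)
    return fresh
```

Each set returned by `new_nodes_per_layer` is the group of nodes that receives initial coordinates via Eq. (B19) before the MLE kernel is run on all active nodes.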

In the bottom plot in Fig. 7 we show the test results for this
procedure wrapping the LMH kernel, applied to a synthetic
Newtonian-$\mathbb{S}^1$ network generated with parameters
similar to the real AS graph. We observe that the inferred
coordinates are very close to the real ones, except for a global
phase shift, which can take any value in $[0, 2\pi]$ due to the
rotational symmetry of the model.



c. Second MLE wrapper 

As mentioned above, it is not necessary to consider the full
graph to infer the coordinates of the most connected nodes. We
now use this observation to speed up the mapping process
significantly. Specifically, we run our first MLE wrapper up to a
subgraph of a certain size, and then add the rest of the nodes
assigning to them their coordinates maximizing Eq. (B19) without
subsequently running the MLE kernel, see Alg. 2.

Algorithm 2 Second MLE wrapper
  activate nodes in $\mathcal{G}(k_1)$
  assign random angular coordinates to nodes in $\mathcal{G}(k_1)$
  remove links among nodes in $\mathcal{G}(k_1)$
  $l_{critic}$ ← maximum layer with full MLE calculations
  for $l$ = 2 to (# of layers) do
    for $j$ = 1 to (# of nodes in $\mathcal{G}(k_l)$ not active) do
      $i$ ← label of new node in $\mathcal{G}(k_l)$ not active
      if # of connections of $i$ with nodes in $\mathcal{G}(k_{l-1})$ > 2 then
        activate new node $i$
        assign to node $i$ coordinate $\theta_i$ maximizing $\ln\mathcal{L}_i[\mathcal{G}(k_{l-1})]$
      end if
    end for
    if $l \leq l_{critic}$ then
      run the MLE kernel on the set of active nodes
    end if
  end for

FIG. 8: Top left: the same as in the bottom plot of Fig. 7 for
nodes of degrees $k \geq 20$. All other plots show the results of
adding layers of lower degree nodes using the second MLE wrapper
with $k_{l_{critic}} = 20$. According to Alg. 2, nodes in newly
added layers infer their coordinates using the coordinates of
nodes in existing layers, without running the MLE kernel, and the
coordinates of existing nodes do not change as new nodes are
added.

FIG. 9: Empirical connection probability based on the inferred
node coordinates, compared to the connection probability used to
generate the synthetic network. The inset shows the details in
the small distance region.

This modification speeds up the overall mapping process because
once the coordinates of a relatively small number of high degree
nodes are inferred, the rest of the process takes $O(N^2)$ steps
to complete. This improvement reduces the total running time of
the Internet mapping to a few hours on a standard computer.
Another practically important feature of this second MLE wrapper
is that new-coming ASs compute their coordinates without existing
ASs changing their coordinates. In other words, once the AS
coordinates are inferred, they stay static as the Internet grows.

We apply this procedure up to nodes of degree 3. 
Nodes of degree 2 and 1 must be analyzed separately 
since all nodes of degree 1, and 40% of nodes of degree 2 
do not form any triangles. As a consequence, the MLE 
kernel cannot reliably infer their metric attributes, i.e., 



their coordinates. Therefore we assign to these nodes 
the angular coordinate of their (highest-degree) neigh- 
bors, which makes sense, especially for nodes of degree 1, 
since the only path to such nodes is via their neighbors. 
Forwarding to such nodes is thus equivalent to forwarding 
to their neighbors. 
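
The rule for nodes of degree 1 and 2 amounts to copying the angle of the highest-degree neighbor that has already been mapped. A minimal sketch, with hypothetical names and dictionary-based inputs:

```python
def assign_low_degree_angles(theta, neighbors, degrees, low_degree_nodes):
    """Give each low-degree node the angle of its highest-degree mapped neighbor.

    theta: dict node -> angle for nodes mapped so far (not modified);
    neighbors: dict node -> iterable of adjacent nodes;
    degrees: dict node -> degree.
    """
    result = dict(theta)
    for v in low_degree_nodes:
        mapped = [u for u in neighbors[v] if u in theta]
        if mapped:  # a node with no mapped neighbor is left without a coordinate
            anchor = max(mapped, key=lambda u: degrees[u])
            result[v] = theta[anchor]
    return result
```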

The test results of this second MLE wrapper are shown in Fig. 8.
The top left plot shows the inferred vs. real coordinates in the
same synthetic network for nodes with degrees $k \geq 20$ using
the first MLE wrapper. The other plots show the corresponding
coordinates for nodes with degrees larger than or equal to 8, 6,
5, 4, 3 using the second MLE wrapper with $k_{l_{critic}} = 20$.
That is, the MLE kernel is not run for these nodes. We observe
that the inference quality does deteriorate for smaller degrees,
but it is remarkable that even in the worst case a majority of
coordinates are correctly inferred.

As an additional test, we show in Fig. 9 the empirical connection
probability among nodes in each subgraph using the coordinates
inferred by the second MLE wrapper, compared to the connection
probability $p(\chi) = (1+\chi^2)^{-1}$ used to generate the
network. We observe a good agreement for high degree subgraphs,
which slightly deteriorates for low degree nodes located at large
effective distances $\chi$.

To map the AS graph, we used the LMH kernel wrapped with the
second MLE wrapper with $k_{l_{critic}} = 20$ and the subgraph
hierarchy defined by $k_1 = 300$, $k_2 = 200$, $k_3 = 160$,
$k_4 = 130$, $k_5 = 110$, $k_6 = 100$, $k_7 = 90$, $k_8 = 80$,
$k_9 = 70$, $k_{10} = 60$, $k_{11} = 50$, $k_{12} = 40$,
$k_{13} = 30$, $k_{14} = 20$, $k_{15} = 10$, $k_{16} = 9$,
$k_{17} = 8$, $k_{18} = 7$, $k_{19} = 6$, $k_{20} = 5$,
$k_{21} = 4$, $k_{22} = 3$.



4. Parameter estimation and finite size effect

Our model has three parameters: the exponent $\gamma$ of the
degree distribution, the average degree $\bar{k}$, and the
exponent $\beta$ of the connection probability.

a. Estimating $\gamma$

We estimate the exponent $\gamma$ via the direct inspection of
the degree distribution, yielding $\gamma = 2.1$.

b. Estimating $N$ and $\bar{k}$

The estimation of $N$ and $\bar{k}$ is more involved for two
reasons. First, the Newtonian-$\mathbb{S}^1$ model generates
nodes of zero degree which are included in the computation of the
average degree, $\bar{k} = \sum_{k \geq 0} kP(k)$. However, in
the real Internet graph all nodes have non-zero degrees.
Therefore we first have to estimate the number of nodes $N$ in
the model, based on the number of nodes $N_{obs}$ we observe in
the real graph. The relationship between the two numbers is

$$N = \frac{N_{obs}}{1-P(0)}, \tag{B20}$$

where $P(0)$ is the probability that a node has zero degree in
the model.

The second complication is due to finite size effects. These
effects are particularly important when the exponent $\gamma$ is
close to 2, which is the case with the Internet. Suppose we
generate a finite size network of $N$ nodes with our
Newtonian-$\mathbb{S}^1$ model with parameters $\gamma$,
$\bar{k}$, and $\beta$. Since the network is finite, there is a
cut-off value for the expected degree of a node, $\kappa_c$,
which depends on the size of the network. The first moment of the
distribution of expected degrees
$\rho(\kappa) = \kappa_0^{\gamma-1}(\gamma-1)\kappa^{-\gamma}$
with this cut-off is

$$\langle\kappa(N)\rangle = \bar{k}\left[1-\left(\frac{\kappa_0}{\kappa_c}\right)^{\gamma-2}\right] \equiv \bar{k}\,\alpha(\kappa_0,\kappa_c). \tag{B21}$$

In the thermodynamic limit $\kappa_c \to \infty$ and
$\alpha(\kappa_0,\kappa_c) \to 1$. However, if $\gamma$ is close
to 2, the approach to these limits is slow, and we have to take
care of finite size corrections.

Accounting for these corrections, the expected degree of a node
with hidden variable $\kappa$ becomes

$$\bar{k}_N(\kappa) = \alpha(\kappa_0,\kappa_c)\,\kappa, \tag{B22}$$

with $\bar{k}_\infty(\kappa) = \kappa$. This equation implies
that the MLE of the hidden variable $\kappa$ of a node of degree
$k$ changes from Eq. (B16) to

$$\kappa^* = \max\left(\frac{\gamma-2}{\gamma-1}\,\bar{k},\; \frac{1}{\alpha(\kappa_0,\kappa_c)}\left(k-\frac{\gamma}{\beta}\right)\right), \tag{B23}$$

while the average degree including zero degree nodes in a finite
size network becomes

$$\bar{k}_N = [\alpha(\kappa_0,\kappa_c)]^{2}\,\bar{k}. \tag{B24}$$

If the average degree observed in the real Internet graph is
$\bar{k}_{obs}$, our estimate of the parameter $\bar{k}$ is then

$$\bar{k} = \frac{1-P(0)}{[\alpha(\kappa_0,\kappa_c)]^{2}}\,\bar{k}_{obs}. \tag{B25}$$



Therefore, in order to estimate the values of $N$ and $\bar{k}$
for a finite network, we first have to estimate the values of
$P(0)$ and $\alpha(\kappa_0,\kappa_c)$. One can check [17] that

$$P(0) = (\gamma-1)\,[\alpha(\kappa_0,\kappa_c)\kappa_0]^{\gamma-1}\,\Gamma(1-\gamma,\,\alpha(\kappa_0,\kappa_c)\kappa_0), \tag{B26}$$

where $\Gamma(x, y)$ is the incomplete Gamma function. We can
also relate the maximum degree $k_{obs}^{max}$ observed in the
real Internet to the expected degree cut-off $\kappa_c$ via

$$k_{obs}^{max} = \alpha(\kappa_0,\kappa_c)\,\kappa_c. \tag{B27}$$

We thus have six unknown values, namely $N$, $P(0)$, $\kappa_0$,
$\kappa_c$, $\alpha(\kappa_0,\kappa_c)$, and $\bar{k}$, and the
system of six equations (B2, B20, B21, B25, B26, B27) involving
them. Substituting into these equations the given values of
$N_{obs} = 23752$, $\bar{k}_{obs} = 4.92$,
$k_{obs}^{max} = 2778$, and $\gamma = 2.1$ observed in the
Internet, we compute numerically $\kappa_0 = 0.9$,
$\kappa_c = 4790$, $\alpha(\kappa_0,\kappa_c) = 0.58$, and
$P(0) = 0.33$, yielding $N = 35685$ and $\bar{k} = 9.86$.
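
The reported solution can be sanity-checked by substituting it back into the closed-form relations. The short script below does this for Eqs. (B2), (B20), (B25) and (B27). It assumes $\alpha(\kappa_0,\kappa_c) = 1 - (\kappa_0/\kappa_c)^{\gamma-2}$, which is what the first moment of the distribution in Eq. (B1) truncated at $\kappa_c$ gives. It takes the rounded $P(0) = 0.33$ as given rather than evaluating the incomplete Gamma function of Eq. (B26).

```python
def alpha(kappa0, kappa_c, gamma):
    """Finite-size factor, assumed here to be 1 - (kappa0 / kappa_c)^(gamma - 2)."""
    return 1.0 - (kappa0 / kappa_c) ** (gamma - 2.0)


# Observed quantities and the rounded solution quoted in the text.
gamma, n_obs, kbar_obs = 2.1, 23752, 4.92
kappa0, kappa_c, p_zero = 0.9, 4790.0, 0.33

a = alpha(kappa0, kappa_c, gamma)
n_model = n_obs / (1.0 - p_zero)                           # Eq. (B20)
kbar_model = (1.0 - p_zero) / a ** 2 * kbar_obs            # Eq. (B25)
kmax_check = a * kappa_c                                   # Eq. (B27)
kappa0_check = kbar_model * (gamma - 2.0) / (gamma - 1.0)  # Eq. (B2)

print(f"alpha = {a:.3f}, N = {n_model:.0f}, kbar = {kbar_model:.2f}")
print(f"k_max = {kmax_check:.0f}, kappa_0 = {kappa0_check:.3f}")
```

The recomputed values agree with the quoted $\alpha = 0.58$, $N = 35685$, $\bar{k} = 9.86$ and $k_{obs}^{max} = 2778$ to within roughly one percent, a residual that is consistent with the rounding of the inputs.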



c. Estimating $\beta$

To estimate $\beta$, we first compare clustering in synthetic
networks with different $\beta$'s to the clustering observed in
the Internet, keeping all other parameters fixed. This procedure
narrows down the possible values of $\beta$ to
$\beta \in [1, 2]$. We then generate Internet maps for different
values of $\beta$ within this range, and perform hyperbolic
greedy forwarding in them. Fig. 10 shows the success ratio of
greedy forwarding as a function of $\beta$ in this region. We
observe that the success ratio increases as $\beta$ decreases,
and then sharply drops at $\beta \approx 1$. The value of $\beta$
maximizing the success ratio is $\beta = 1.45$, and we used this
value in our final Internet map.

FIG. 10: Success ratio of greedy forwarding as a function of
$\beta$ for the Internet graph mapping.



Appendix C: The Archipelago Internet topology

We use the AS Internet topology of June 2009 extracted from the
data collected by the Archipelago active measurement
infrastructure (ARK) developed by CAIDA [26]. The AS topology
contains 23752 ASs and 58416 AS links, yielding the average AS
degree $\bar{k} = 4.92$. The maximum AS degree is
$k_{max} = 2778$. The average clustering measured over ASs of
degree larger than 1 is $C = 0.61$, yielding temperature
$T = 0.69$, and hyperbolic disc radius $R = 27$. The exponent of
the power-law AS degree distribution is $\gamma = 2.1$. This
Internet topology, along with the hyperbolic Internet map, is
released with this paper [37].



Appendix D: Mapping AS's to countries

The AS-to-country mapping is taken from the CAIDA AS ranking
project [43]. It uses two methods for this task. The first method
is IP-based. It splits the IP address space advertised by an AS
into small blocks, and then maps each block to a country using
[44]. If not all IP blocks of an AS map to the same country, then
the other, WHOIS-based method is used, which reports the country
where the AS headquarters are located according to the WHOIS
database [45]. Since large ASs have points of presence in many
countries, they tend to map to multiple countries using the
IP-based method. Therefore, if we did not apply the WHOIS-based
method to them, they would no longer map to a single country. If
we ignored such ASs, the angular distributions of the remaining
ASs belonging to a given country would be even more localised,
including the US, EU, and UK ASs. In our hyperbolic map data, we
release the AS-to-country associations using both methods,
IP+WHOIS-based and IP-based. The latter has no country
information for many ASs with conflicting country mappings.



Appendix E: Geographic routing

To perform standard geographic routing we first map each AS to a
collection of geographic locations (characterised by their
latitudes and longitudes) using the IP-based method, and then
find the centre of mass for each collection. We thus obtain
unique geographic coordinates for each AS. We then perform
standard greedy forwarding over the AS topology, computing
geographic distances between ASs using the spherical law of
cosines. For hyperbolised geographic routing, we keep the AS
angular coordinates equal to their geographic coordinates, but
also, based on the AS degree, we assign to each AS a radial
coordinate, according to the relationship between node degrees
and radial positions in the three-dimensional Einsteinian model,
and then perform greedy forwarding in this three-dimensional
hyperbolic space.



Appendix F: Traffic and congestion considerations 

In this section we measure a proxy for the amount 
of traffic that ASs would have to handle under greedy 
forwarding. 

In view of our finding that greedy forwarding almost always
follows the shortest paths, we expect that the traffic
load on an AS under greedy forwarding is essentially the 
same as under shortest path forwarding. We confirm this 
expectation in Fig. 11 where we juxtapose the normalized 
betweennesses corresponding to shortest path and greedy 
forwarding. To compute normalized betweenness, we se- 
lect a large number of source/destination AS pairs chosen 
uniformly at random among all ASs. We then find two 
paths for each AS pair using shortest path and greedy 
forwarding. Normalized betweenness of a given AS is 
then the fraction of all paths going through this AS. We 
observe in Fig. 11 that the normalized betweennesses for 
shortest path and greedy forwarding are almost identi- 
cal as expected. We also observe in the top plot, that 
in agreement with the previous studies on this subject, 
e.g. [46], the normalized betweenness grows as a power 
law of the AS degree. This observation may create an 
impression that high-degree ASs may suffer from traffic 
congestion problems. However, this impression is wrong 
not only because of the results in [31], but also because 
of the following considerations. 

In the real Internet, ASs are not singular nodes but differently
sized networks composed of (many) routers. The size of an AS,
measured by the number of routers in it, is roughly proportional
to the AS degree [47, 48]. ASs of different size generate and
consume different volumes of traffic. Also, a larger AS can
handle larger transit traffic volumes without being congested.
These two observations suggest the following modifications to the
top plot in Fig. 11. First, we model traffic with the more
realistic assumption that the amount of traffic an AS generates
or consumes is proportional to the AS size. That is, instead of
choosing source and destination AS pairs at random, we choose
each AS with a probability proportional to the number of routers
in the AS using the data from [48]. Second, we divide the
normalized betweenness value for each AS by the number of routers
in the AS, thus estimating the per-router traffic load. The
result shown in the bottom plot of Fig. 11 demonstrates that the
important large ASs are, in fact, less prone to congestion
problems.

FIG. 11: Top: standard normalized betweenness as a function of
the AS degree for shortest path forwarding and greedy forwarding.
The solid line is a power law of exponent 1.2. Bottom: normalized
betweenness divided by number of routers in the AS, with source
and destination ASs chosen with a probability proportional to the
number of routers in them.
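
The sampled betweenness measurement described in this appendix can be sketched as follows. The code assumes an adjacency dictionary, uses breadth-first search for shortest paths, and counts the endpoints of a path as lying on it; these are our choices for illustration, and a different path function (for instance a greedy forwarding routine) can be plugged in via `path_fn`. Passing per-AS router counts as `weights` reproduces the size-proportional choice of sources and destinations.

```python
import random
from collections import deque


def bfs_path(adj, src, dst):
    """One shortest path from src to dst as a list of nodes, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if dst not in parent:
        return None
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def sampled_betweenness(adj, num_pairs, weights=None, path_fn=bfs_path, seed=0):
    """Fraction of sampled paths that pass through each node."""
    rng = random.Random(seed)
    nodes = list(adj)
    w = [weights[v] for v in nodes] if weights else None
    counts = dict.fromkeys(nodes, 0)
    total = 0
    for _ in range(num_pairs):
        src, dst = rng.choices(nodes, weights=w, k=2)
        path = path_fn(adj, src, dst) if src != dst else None
        if path is None:
            continue  # skip self-pairs and undelivered packets
        total += 1
        for v in path:
            counts[v] += 1
    return {v: counts[v] / total for v in nodes} if total else counts
```

Dividing each returned value by the router count of the corresponding AS gives the per-router load proxy used in the bottom plot of Fig. 11.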



Appendix G: Dealing with new-coming AS's 

In this section we show that if the existing ASs keep 
their hyperbolic coordinates fixed, while the ASs joining 
the Internet anew over years compute their coordinates 
in a localizing manner, i.e., using the LMH kernel (B18), 
then the performance of greedy forwarding does not sig- 
nificantly degrade, even at long time scales. 

To demonstrate this we perform the following experi- 
ment. We replay the AS Internet growth from January 
2007 to June 2009 similar to [5]. Specifically, we obtain 
11 lists of ASs observed in the Internet at different dates 
as described in [5]. The AS lists are linearly spaced in 
time with an interval of three months: time $t = 0$ corresponds 
to January 2007, $t = 1$ is April 2007, and so 
on until $t = 10$, June 2009. We denote the obtained AS 
lists by $A_t$. The number of ASs in $A_0$ is 17258, while 
the numbers of new ASs in $A_t$ with $t = 1, 2, \ldots, 10$, but 
not in $A_0$, are 806, 1614, 2389, 3103, 3973, 4794, 5434, 
5843, 6207, and 6426. We then take our Archipelago AS 
topology [26] of June 2009, and for each $t = 0, 1, \ldots, 10$ 
we remove from it all ASs and their adjacent links that 
are not in $A_t$, thus obtaining a time series of historical 
AS topologies $G_t$. We then embed $G_0$ using the SMH 
kernel (B17), but for each subsequent embedding of $G_{t'}$ 
with $t' > 0$, we keep the hyperbolic coordinates of ASs 
in $G_t$ with $t < t'$ fixed, and compute coordinates for the 
new ASs using the LMH kernel (B18). That is, once an 
AS appears at some time $t \geq 0$ and gets its coordinates 
computed, using either the SMH ($t = 0$) or LMH ($t > 0$) 
computations, the AS then never changes its coordinates 
for the rest of the observation period. In Fig. 12 we show 
the average success ratio $p_s$ and stretch $s$ for greedy forwarding 
in $G_t$. 
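The bookkeeping of this replay can be sketched in Python as follows. This is only an illustration of the procedure, not the authors' code. The callables `embed_initial` and `place_new` are hypothetical stand-ins for the SMH (B17) and LMH (B18) computations, whose details are not reproduced here, and processing newcomers in sorted order is an assumption made for determinism.

```python
def induced_topology(edges, alive):
    """Keep only the links whose two endpoints are both in `alive`,
    i.e. remove all ASs not in the list together with their adjacent links."""
    return [(u, v) for u, v in edges if u in alive and v in alive]

def replay_growth(edges, as_lists, embed_initial, place_new):
    """Replay the growth over the AS lists as_lists[0], as_lists[1], ...

    embed_initial(g0, alive) -> {AS: coordinates}      (hypothetical SMH stand-in)
    place_new(asn, g_t, coords) -> coordinates          (hypothetical LMH stand-in)

    Coordinates, once assigned, are never changed at later steps.
    Returns a list of (edge list of G_t, snapshot of the coordinates at t).
    """
    coords = {}
    history = []
    for t, alive in enumerate(as_lists):
        g_t = induced_topology(edges, alive)
        if t == 0:
            coords = dict(embed_initial(g_t, alive))
        else:
            # Only ASs without coordinates are placed; existing ones stay fixed.
            for asn in sorted(a for a in alive if a not in coords):
                coords[asn] = place_new(asn, g_t, coords)
        history.append((g_t, dict(coords)))
    return history
```

Each snapshot in the returned history can then be passed to a greedy forwarding simulation to obtain the success ratio and stretch at that time step.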

Remarkably, we observe only minor variations of suc- 
cess ratio and stretch over more than 2.5 years of rapid 
Internet growth. The success ratio does decrease, but by 
less than 1%. We thus conclude that greedy forwarding 
using our hyperbolic AS map is quite robust with respect 
to Internet historical growth. Existing ASs do not have 
to recompute their hyperbolic coordinates when new ASs 
join the Internet. Recomputations of all AS coordinates 
may be executed to improve the greedy forwarding per- 
formance, but the time scale for such recomputations ex- 
ceeds the time scale of Internet historical evolution, i.e., 
years, thus exceeding by orders of magnitude the time 
scale of transient dynamics of failing AS links and nodes, 
i.e., seconds or minutes. That is why the existing AS 
coordinates are essentially static, and can stay the same 
for years. 
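For readers who want to reproduce the two metrics discussed here, the following Python sketch shows one way to compute the success ratio and average stretch of greedy forwarding. It rests on several assumptions that are not taken from this appendix: coordinates are polar pairs (r, theta) in the hyperbolic plane of curvature -1, distances follow the hyperbolic law of cosines, a packet is dropped as soon as no neighbor is strictly closer to the destination than the current node, distinct nodes have distinct coordinates, and stretch is the ratio of greedy hops to shortest-path hops averaged over successful pairs. All function names are illustrative.

```python
import math
from collections import deque

def hyperbolic_distance(a, b):
    """Distance between polar points (r, theta) in the hyperbolic plane
    of curvature -1, via the hyperbolic law of cosines."""
    (r1, t1), (r2, t2) = a, b
    dtheta = math.pi - abs(math.pi - abs(t1 - t2) % (2 * math.pi))
    if dtheta == 0.0:
        return abs(r1 - r2)  # same ray: avoids rounding noise in acosh
    x = (math.cosh(r1) * math.cosh(r2)
         - math.sinh(r1) * math.sinh(r2) * math.cos(dtheta))
    return math.acosh(max(x, 1.0))

def greedy_path(adj, coords, src, dst):
    """Forward to the neighbor closest to dst; return None (packet dropped)
    if no neighbor is strictly closer to dst than the current node."""
    path, cur = [src], src
    while cur != dst:
        here = hyperbolic_distance(coords[cur], coords[dst])
        best = min(adj[cur], default=None,
                   key=lambda v: hyperbolic_distance(coords[v], coords[dst]))
        if best is None or hyperbolic_distance(coords[best], coords[dst]) >= here:
            return None
        path.append(best)
        cur = best
    return path

def shortest_hops(adj, src, dst):
    """Hop length of a shortest path found by breadth-first search."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, d = queue.popleft()
        if node == dst:
            return d
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, d + 1))
    return None

def success_and_stretch(adj, coords, pairs):
    """Return (success ratio, average stretch) over source-destination pairs
    with src != dst; the stretch is averaged over successful pairs only."""
    stretches = []
    for s, d in pairs:
        p = greedy_path(adj, coords, s, d)
        if p is not None:
            stretches.append((len(p) - 1) / shortest_hops(adj, s, d))
    p_s = len(stretches) / len(pairs)
    avg = sum(stretches) / len(stretches) if stretches else float("nan")
    return p_s, avg
```

Because the distance to the destination strictly decreases at every hop, the forwarding loop always terminates, either at the destination or at a local minimum.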

Appendix H: Sensitivity to missing links 

It is widely known that the existing measurements of 
the Internet topology miss a number of AS links [43, 49, 
50]. However, in view of the robustness of greedy forwarding 
performance with respect to link removals, one 
could expect that its performance would be robust with 
respect to missing links as well. Furthermore, if our hyperbolic 
map is used in practice, then greedy forwarding 
will see and use those links that we do not see. Therefore 
one can intuitively expect that, in this case, the efficiency 
of greedy forwarding will be actually higher than 
we report in this paper, simply because these links that 
we miss, but greedy forwarding would not miss, would 
provide additional shortcuts between potentially remote 
ASs. If so, the routing results presented in this paper 
should be considered as lower bounds. 

FIG. 12: Average success ratio $p_s$ and stretch $s$ for greedy 
forwarding in the AS Internet growing from January 2007 
($t = 0$) to June 2009 ($t = 10$) with linear time steps ($\Delta t = 1$) 
of three months, e.g., $t = 1$ corresponds to April 2007. 

FIG. 13: Success ratio as a function of the fraction of removed 
links among nodes of degree above 5 for the two scenarios 
described in the text. Note that 30% of missing links in this 
subgraph corresponds to 14% of the total number of links in 
the network. 

To confirm this intuition, we perform the following ex- 
periment. It is known that the majority of missing links 
in the Internet are peer-to-peer links among provider ASs 
of moderate size [43, 50]. To emulate the missing link issue, 
we thus remove a fraction (ranging from 0% to 30%) 
of links among nodes with degree above a certain threshold 
($k = 5$) from our AS graph. We then map these 
graphs with different numbers of emulated missing links 
to $\mathbb{H}^2$ as described in Section B to find hyperbolic coordinates 
for each AS. Using these maps with missing 
information, we then consider two different greedy for- 
warding scenarios for each map: 

1. In the first scenario, we navigate an AS graph 
mapped with a fraction of links removed, and compute 
the success ratio of greedy forwarding in the 
graph. This scenario tries to mimic the missing 
links issue directly. We have incomplete topology 
measurements of the real Internet, but we have no 
other option than to use these measurements to map 
the Internet to its hyperbolic space, and study navigability 
with this map, which we know misses some 
information. 

2. In the second scenario, we use the hyperbolic map 
obtained with missing links, but we then add back 
those removed links, and navigate the complete 
graph. This scenario is motivated by the observation 
that even though our map is constructed with 
some links missing, these missing links will still be 
used by ASs attached to them to forward information 
if this map is used in practice. 
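The link-removal step shared by both scenarios can be sketched in Python as follows. This is a hedged illustration rather than the procedure actually used: the function name, the seed parameter, uniform random selection of the links to remove, and computing node degrees in the original (complete) graph are all assumptions. The edge list is assumed to contain each link once.

```python
import random
from collections import Counter

def remove_core_links(edges, fraction, k_min=5, seed=0):
    """Remove a given fraction of the links whose two endpoints both have
    degree above k_min (degrees taken in the original graph).

    Returns (kept, removed):
      kept             -> graph to be mapped, and navigated in scenario 1
      kept + removed   -> complete graph navigated in scenario 2, using the
                          coordinates obtained from the map of `kept`
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    core = [(u, v) for u, v in edges
            if degree[u] > k_min and degree[v] > k_min]
    rng = random.Random(seed)
    n_remove = int(round(fraction * len(core)))
    removed = set(rng.sample(core, n_remove))
    kept = [e for e in edges if e not in removed]
    return kept, sorted(removed)
```

In both scenarios the coordinates come from the map of the reduced graph; only the graph handed to the greedy forwarding simulation differs.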

The results of these two scenarios are shown in Fig. 13. 
As intuitively expected, our mapping is quite robust with 
respect to missing links: the success ratio decreases by 
less than 5% even if up to 14% of links are removed 
from the topology before we map it. Also as expected, 
the missing links, when added back, increase the suc- 
cess ratio. That is, even though the map has been con- 
structed using partial information, navigability improves 
when missing links are considered. These results confirm 
that the routing results reported in this paper are in real- 
ity lower bounds for the success ratio that can be achieved 
if our map is used in practice. In fact, one may some- 
what paradoxically expect that the more links are missed 
in the measured Internet topology we used for mapping, 
the better the success ratio would be in practice, since 
according to Fig. 13, the success ratio improvement due 
to re-adding of removed links tends to increase with the 
number of removed links. 
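
As a quick consistency check on these percentages, the caption of Fig. 13 states that removing 30% of the links among nodes of degree above 5 amounts to 14% of all links. Writing $f$ for the share of all links that lie in this subgraph (a symbol introduced here only for illustration), this implies

\[
0.30\, f = 0.14 \quad\Longrightarrow\quad f = \frac{0.14}{0.30} \approx 0.47,
\]

so, under these rounded figures, slightly under half of all links in the measured topology connect two nodes of degree above 5.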



Appendix I: Comments on AS-level routing 

Our approach belongs to a wide class of approaches 
proposing to reduce routing granularity to the level of 
Autonomous Systems [51-63]. The key difference be- 
tween ours and the existing approaches in this class is 
that the latter require some form of routing on the dy- 
namic AS graph. As soon as the AS topology changes, 
new AS routes must be recomputed, so that routing com- 
munication overhead is unavoidable in this case. In our 
case such recomputations are not needed since, as we have 
shown, the efficiency of greedy forwarding is sustained in the 
presence of failing AS nodes and links, even though ASs 
do not exchange any information about topology modifi- 
cations, and do not change their hyperbolic coordinates, 
i.e., even though they do not incur any communication 
overhead. The bulk of the routing overhead in the Internet today 
is due to traffic engineering and multihoming in the 
first place [64, 65]. How the AS-level routing class of ap- 
proaches helps to deal with and reduce this overhead is 
discussed in the literature cited above. 